Broadcast News Transcription System
Abstract
This paper describes the 1998 CMU Hub 4 Spanish broadcast news transcription system. We focus on the development and improvements of the system with respect to the 1997 system. Both the 1997 and 1998 systems were developed using exactly the same acoustic and language model training material; the improvements therefore resulted from better utilization and modeling of these corpora and from a better decoding configuration and strategy. Specifically, we employed several language models, a larger lexicon, and larger acoustic models than in our 1997 system. Due to these improvements, we achieved a reduction of 28% in the error rate on this year’s development material.
Related resources
Online Temporal Language Model Adaptation for a Thai Broadcast News Transcription System
This paper investigates the effectiveness of online temporal language model adaptation when applied to a Thai broadcast news transcription task. Our adaptation scheme works as follows: first, an initial language model is trained with the broadcast news transcriptions available during the development period. Then the language model is adapted over time with more recent broadcast news transcription and ...
Advances in automatic transcription of Italian broadcast news
This paper presents some recent improvements in automatic transcription of Italian broadcast news obtained at ITC-irst. A first preliminary activity was carried out in order to develop a suitable speech corpus for the Italian language. The resulting corpus, formed by recordings covering 30 hours of radio news, was exploited for developing a baseline system for transcription of broadcast news. Th...
Japanese broadcast news transcription
In this paper, we describe the ongoing development of a Japanese Broadcast News Transcription system at BBN Technologies. This is a collaboration between BBN and NHK to use automatic speech recognition technology to provide live closed captions for NHK’s TV news programs in Japan. We describe what the NHK Broadcast News Corpus comprises and how we adopted transcription technology developed for ...
Real-time recognition of broadcast news
Although the performance of state-of-the-art automatic speech recognition systems on the challenging task of broadcast news transcription has improved considerably in recent years, many of the systems operate at 130-300 times real time [1]. Many applications of automatic transcription of broadcast news, e.g. closed-caption subtitles for television broadcasts, require real-time operation. This pap...
Speech retrieval of Mandarin broadcast news via mobile devices
This paper presents a system for speech retrieval of Mandarin broadcast news. First, several data-driven and unsupervised approaches are integrated into the broadcast news transcription system to improve the speech recognition accuracy and efficiency. Then, a multi-scale indexing paradigm for broadcast news retrieval is proposed to make use of the special structural properties of the Chinese la...
The CUHTK-Entropic 10xRT Broadcast News Transcription System
This paper describes the development of the CUHTK-Entropic 10xRT Broadcast News Transcription System. Previous HTK broadcast news transcription systems have focused on maximising accuracy with few constraints on compute power available. In order to develop a system running in under 10 times real time on a single CPU, detailed investigation and optimisation of the system architecture and mode of...